Decision-Rule Solutions for Data Mining with Missing Values
Abstract
A method is presented to induce decision rules from data with missing values where (a) the format of the rules is no different from that of rules for data without missing values and (b) no special features are specified to prepare the original data or to apply the induced rules. This method generates compact Disjunctive Normal Form (DNF) rules. Each class has an equal number of unweighted rules. A new example is classified by applying all rules and assigning the example to the class with the most satisfied rules. Disjuncts in rules are naturally overlapping. When combined with voted solutions, the inherent redundancy is enhanced. We provide experimental evidence that this transparent approach to classification can yield strong results for data mining with missing values.
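The voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names (`age`, `bp`), the thresholds, the class labels, and the helper functions `gt`, `le`, `satisfied`, and `classify` are all hypothetical. In particular, the sketch assumes that a rule term testing a missing value simply evaluates to false, and that ties go to the first class listed; the abstract does not state either convention.

```python
from typing import Callable, Dict, List, Optional

# An example maps feature names to values; None marks a missing value.
Example = Dict[str, Optional[float]]
Term = Callable[[Example], bool]
Rule = List[Term]  # one rule = a conjunction of terms (one disjunct of the DNF)


def gt(feature: str, threshold: float) -> Term:
    """Term 'feature > threshold'. ASSUMPTION: false when the value is missing."""
    def test(x: Example) -> bool:
        value = x.get(feature)
        return value is not None and value > threshold
    return test


def le(feature: str, threshold: float) -> Term:
    """Term 'feature <= threshold'. ASSUMPTION: false when the value is missing."""
    def test(x: Example) -> bool:
        value = x.get(feature)
        return value is not None and value <= threshold
    return test


def satisfied(rule: Rule, x: Example) -> bool:
    """A rule is satisfied only if every one of its terms holds."""
    return all(term(x) for term in rule)


def classify(rules_by_class: Dict[str, List[Rule]], x: Example) -> str:
    """Apply all rules and return the class with the most satisfied rules.

    Rules are unweighted: each satisfied rule contributes exactly one vote.
    ASSUMPTION: ties are broken in favour of the class listed first.
    """
    votes = {
        label: sum(satisfied(rule, x) for rule in rules)
        for label, rules in rules_by_class.items()
    }
    return max(votes, key=votes.get)


# Hypothetical rule set: the same number of rules (two) for each class.
RULES: Dict[str, List[Rule]] = {
    "sick": [[gt("age", 50), gt("bp", 140)], [gt("bp", 160)]],
    "healthy": [[le("age", 50)], [le("bp", 140)]],
}

print(classify(RULES, {"age": None, "bp": 170}))  # age is missing
print(classify(RULES, {"age": 40, "bp": None}))   # bp is missing
```

Because the disjuncts overlap, an example with a missing value can still satisfy other rules for the correct class, which is the redundancy the abstract refers to. No imputation step and no special rule syntax is needed under the assumed convention.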
Similar resources
Performance evaluation of different estimation methods for missing rainfall data
There are numerous methods for estimating missing values, some of which are chosen depending on the data type and regional climatic characteristics. In this research, part of the monthly precipitation data from the Sarab synoptic station, East Azerbaijan province, Iran, was randomly treated as missing. In order to study the effectiveness of various methods to estimate missing data, by seven classic s...
Machine Learning Based Missing Value Imputation Method for Clinical Dataset
Missing value imputation is one of the biggest tasks of data pre-processing when performing data mining. Most medical datasets are usually incomplete. Simply removing the cases from the original datasets can bring more problems than solutions. A suitable method for missing value imputation can help to produce good quality datasets for better analysing clinical trials. In this paper we explore t...
Fuzzy Unordered Rules Induction Algorithm Used as Missing Value Imputation Methods for K-Mean Clustering on Real Cardiovascular Data
Missing value imputation is one of the biggest tasks of data pre-processing when performing data mining. Most medical datasets are usually incomplete. Simply removing the cases from the original datasets can bring more problems than solutions. A suitable method for missing value imputation can help to produce good quality datasets for better analysing clinical trials. In this paper we explore t...
A Comparative Study on Decision Rule Induction for incomplete data using Rough Set and Random Tree Approaches
Handling missing attribute values is among the most challenging processes in data analysis. Many approaches can be adopted to handle missing attributes. In this paper, a comparative analysis is made of an incomplete dataset for future prediction using the rough set approach and random tree generation in data mining. The result of simple classification technique (using random tree ...
Handling Missing Attribute Values
In this chapter, methods of handling missing attribute values in data mining are described. These methods are categorized into sequential and parallel. In sequential methods, missing attribute values are first replaced by known values as a preprocessing step, and then knowledge is acquired from a data set with all attribute values known. In parallel methods, there is no preprocessing, i.e., knowledge...